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ABSTRACT 

Signs  are  everywhere  in  our  lives.  They  make  our  lives  easier 
when  we  are  familiar  with  them.  But  sometimes  they  also  pose 
problems.  For  example,  a  tourist  might  not  be  able  to  understand 
signs  in  a  foreign  country.  In  this  paper,  we  present  our  efforts 
towards  automatic  sign  translation.  We  discuss  methods  for 
automatic  sign  detection.  We  describe  sign  translation  using 
example  based  machine  translation  technology.  We  use  a  user- 
centered  approach  in  developing  an  automatic  sign  translation 
system.  The  approach  takes  advantage  of  human  intelligence  in 
selecting  an  area  of  interest  and  domain  for  translation  if  needed. 
A  user  can  determine  which  sign  is  to  be  translated  if  multiple 
signs  have  been  detected  within  the  image.  The  selected  part  of 
the  image  is  then  processed,  recognized,  and  translated.  We  have 
developed  a  prototype  system  that  can  recognize  Chinese  signs 
input  from  a  video  camera  which  is  a  common  gadget  for  a  tourist, 
and  translate  them  into  English  text  or  voice  stream. 
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1.  INTRODUCTION 

Languages  play  an  important  role  in  human  communication. 
We  communicate  with  people  and  information  systems 
through  diverse  media  in  increasingly  varied  environments. 
One  of  those  media  is  a  sign.  A  sign  is  something  that 
suggests  the  presence  of  a  fact,  condition,  or  quality.  Signs 
are  everywhere  in  our  lives.  They  make  our  lives  easier 
when  we  are  familiar  with  them.  But  sometimes  they  also 
pose  problems.  For  example,  a  tourist  might  not  be  able  to 
understand  signs  in  a  foreign  country.  Unfamiliar  language 
and  environment  make  it  difficult  for  international  tourists 
to  read  signs,  take  a  taxi,  order  food,  and  understand  the 
comments  of  passersby. 

At  the  Interactive  Systems  Lab  of  Carnegie  Mellon 
University,  we  are  developing  technologies  for  tourist 


applications  [12].  The  systems  are  equipped  with  a  unique 
combination  of  sensors  and  software.  The  hardware 
includes  computers,  GPS  receivers,  lapel  microphones  and 
earphones,  video  cameras  and  head-mounted  displays.  This 
combination  enables  a  multimodal  interface  to  take 
advantage  of  speech  and  gesture  inputs  to  provide 
assistance  for  tourists.  The  software  supports  natural 
language  processing,  speech  recognition,  machine 
translation,  handwriting  recognition  and  multimodal  fusion. 
A  vision  module  is  trained  to  locate  and  read  written 
language,  is  able  to  adapt  to  new  environments,  and  is  able 
to  interpret  intentions  offered  by  the  user,  such  as  a  spoken 
clarification  or  pointing  gesture. 

In  this  paper,  we  present  our  efforts  towards  automatic  sign 
translation.  A  system  capable  of  sign  detection  and 
translation  would  benefit  three  types  of  individuals:  tourists, 
the  visually  handicapped  and  military  intelligence.  Sign 
translation,  in  conjunction  with  spoken  language 
translation,  can  help  international  tourists  to  overcome  these 
barriers.  Automatic  sign  recognition  can  help  us  to  increase 
environmental  awareness  by  effectively  increasing  our  field 
of  vision.  It  can  also  help  blind  people  to  extract 
information.  A  successful  sign  translation  system  relies  on 
three  key  technologies:  sign  extraction,  optical  character 
recognition  (OCR),  and  language  translation.  Although 
much  research  has  been  directed  to  automatic  speech 
recognition,  handwriting  recognition,  OCR,  speech  and  text 
translation,  little  attention  has  been  paid  to  automatic  sign 
recognition  and  translation  in  the  past.  Our  current  research 
is  focused  on  automatic  sign  detection  and  translation  while 
taking  advantage  of  OCR  technology  available.  We  have 
developed  robust  automatic  sign  detection  algorithms.  We 
have  applied  Example  Based  Machine  Translation  (EBMT) 
technology  [1]  in  sign  translation. 

Fully  automatic  extraction  of  signs  from  the  environment  is 
a  challenging  problem  because  signs  are  usually  embedded 
in  the  environment.  Sign  translation  has  some  special 
problems  compared  to  a  traditional  language  translation 
task.  They  can  be  location  dependent.  The  same  text  on 
different  signs  can  be  treated  differently.  For  example,  it  is 
not  necessary  to  translate  the  meanings  for  names,  such  as 
street  names  or  company  names,  in  most  cases.  In  the 
system  development,  we  use  a  user-centered  approach.  The 
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approach  takes  advantage  of  human  intelligence  in  selecting 
an  area  of  interest  and  domain  for  translation  if  needed.  For 
example,  a  user  can  determine  which  sign  is  to  be  translated 
if  multiple  signs  have  been  detected  within  the  image.  The 
selected  part  of  the  image  is  then  processed,  recognized, 
and  translated,  with  the  translation  displayed  on  a  hand-held 
wearable  display,  or  a  head  mounted  display,  or  synthesized 
as  a  voice  output  message  over  the  earphones.  By  focusing 
only  on  the  information  of  interest  and  providing  domain 
knowledge,  the  approach  provides  a  flexible  method  for 
sign  translation.  It  can  enhance  the  robustness  of  sign 
recognition  and  translation,  and  speed  up  the  recognition 
and  translation  process.  We  have  developed  a  prototype 
system  that  can  recognize  Chinese  sign  input  from  a  video 
camera  which  is  a  common  gadget  for  a  tourist,  and 
translate  the  signs  into  English  text  or  voice  stream. 

The  organization  of  this  paper  is  as  follows:  Section  2 
describes  challenges  in  sign  recognition  and  translation. 
Section  3  discusses  methods  for  sign  detection.  Section  4 
addresses  the  application  of  EBMT  technology  into  sign 
translation.  Section  5  introduces  a  prototype  system  for 
Chinese  sign  translation.  Section  6  gives  experimental 
results.  Section  Tconcludes  the  paper. 

2.  PROBLEM  DESCRIPTION 

A  sign  can  be  a  displayed  structure  bearing  letters  or 
symbols,  used  to  identify  or  advertise  a  place  of  business.  It 
can  also  be  a  posted  notice  bearing  a  designation,  direction, 
or  command.  Figure  1  and  Figure  2  illustrate  two  examples 
of  signs.  Figure  1  shows  a  Russian  sign  completely 
embedded  in  the  background.  Figure  2  is  a  sign  that 
contains  German  text  with  no  verb  and  article.  In  this 
research,  we  are  interested  in  translating  signs  that  have 
direct  influence  upon  a  tourist  from  a  different  country  or 
culture.  These  signs,  at  least,  include  the  following 
categories: 

•  Names:  street,  building,  company,  etc. 

•  Information:  designation,  direction,  safety 

advisory,  warning,  notice,  etc. 

•  Commercial:  announcement,  advertisement,  etc. 

•  Traffic:  warning,  limitation,  etc. 

•  Conventional  symbol:  especially  those  are 
confusable  to  a  foreign  tourist,  e.g.,  some  symbols 
are  not  international. 

Fully  automatic  extraction  of  signs  from  the  environment  is 
a  challenging  problem  because  signs  are  usually  embedded 
in  the  environment.  The  related  work  includes  video  OCR 
and  automatic  text  detection.  Video  OCR  is  used  to  capture 
text  in  the  video  images  and  recognize  the  text.  Many 
video  images  contain  text  contents.  Such  text  can  come 
from  computer-generated  text  that  is  overlaid  on  the 
imagery  (e.g.,  captions  in  broadcast  news  programs)  or  text 
that  appears  as  a  part  of  the  video  scene  itself  (e.g.,  a  sign 
outside  a  place  of  business,  or  a  post).  Location  and 


recognition  of  text  in  video  imagery  is  challenging  due  to 
low  resolution  of  characters  and  complexity  of  background. 
Research  in  video  OCR  has  mainly  focused  on  locating  the 
text  in  the  image  and  preprocessing  the  text  area  for  OCR 
[4]  [6]  [7]  [9]  [10].  Applications  of  the  research  include 
automatically  identifying  the  contents  of  video  imagery  for 
video  index  [7]  [9],  and  capturing  documents  from  paper 
source  during  reading  and  writing  [10].  Compared  to  other 
video  OCR  tasks,  sign  extraction  takes  place  in  a  more 
dynamic  environment.  The  user’s  movement  can  cause 
unstable  input  images.  Non-professional  equipment  can 
make  the  video  input  poorer  than  that  of  other  video  OCR 
tasks,  such  as  detecting  captions  in  broadcast  news 
programs.  In  addition,  sign  extraction  has  to  be 
implemented  in  real  time  using  limited  resources. 


Figure  1  A  sign  embedded  in  the  background 


LKW-Parkplto 

ab  16.6  6Uhr 
gesperrt 


Figure  2  A  German  sign 

Sign  translation  requires  sign  recognition.  A  straightforward 
idea  is  to  use  advanced  OCR  technology.  Although  OCR 
technology  works  well  in  many  applications,  it  requires 
some  improvements  before  it  can  be  applied  to  sign 
recognition.  At  current  stage  of  the  research,  we  will  focus 
our  research  on  sign  detection  and  translation  while  taking 
advantage  of  state-of-the-art  OCR  technologies. 


Sign  translation  has  some  special  problems  compared  to  a 
traditional  language  translation  task.  The  function  of  signs 
lead  to  the  characteristic  of  the  text  used  in  the  sign:  it  has 
to  be  short  and  concise.  The  lexical  mismatch  and  structural 
mismatch  problems  become  more  severe  in  sign  translation 
because  shorter  words/phrases  are  more  likely  to  be 
ambiguous  and  insufficient  information  from  the  text  to 
resolve  the  ambiguities  which  are  related  to  the 
environment  of  the  sign. 

We  assume  that  a  tourist  has  a  video  camera  to  capture 
signs  into  a  wearable  or  portable  computer.  The  procedure 
of  sign  translation  is  as  follows:  capturing  the  image  with 
signs,  detecting  signs  in  the  image,  recognizing  signs,  and 
translating  results  of  sign  recognition  into  target  language. 

3.  AUTOMATIC  SIGN  DETECTION 

Fully  automatic  extraction  of  signs  from  the  environment  is 
very  difficult,  because  signs  are  usually  embedded  in  the 
environment.  There  are  many  challenges  in  sign  detection, 
such  as  variation,  motion  and  occlusion.  We  have  no 
control  in  font,  size,  orientation,  and  position  of  sign  texts. 
Originating  in  3-D  space,  text  on  signs  in  scene  images  can 
be  distorted  by  slant,  tilt,  and  shape  of  objects  on  which 
they  are  found  [8].  In  addition  to  the  horizontal  left-to-right 
orientation,  other  orientations  include  vertical,  circularly 
wrapped  around  another  object,  slanted,  sometimes  with  the 
characters  tapering  (as  in  a  sign  angled  away  from  the 
camera),  and  even  mixed  orientations  within  the  same  text 
area  (as  would  be  found  on  text  on  a  T-shirt  or  wrinkled 
sign).  Unlike  other  text  detection  and  video  OCR  tasks,  sign 
extraction  is  in  a  more  dynamic  environment.  The  user’s 
movement  can  cause  unstable  input  images.  Furthermore, 
the  quality  of  the  video  input  is  poorer  than  that  of  other 
video  OCR  tasks,  such  as  detecting  captions  in  broadcast 
news  programs,  because  of  low  quality  of  equipment. 
Moreover,  sign  detection  has  to  be  real-time  using  a  limited 
resource.  Though  automatic  sign  detection  is  a  difficult 
task,  it  is  crucial  for  a  sign  translation  system. 

We  use  a  hierarchical  approach  to  address  these  challenges. 
We  detect  signs  at  three  different  levels.  At  the  first  level, 
the  system  performs  coarse  detection  by  extracting  features 
from  edges,  textures,  colors/intensities.  The  system 
emphasizes  robust  detection  at  this  level  and  tries  to 
effectively  deal  with  the  different  conditions  such  as 
lighting,  noise,  and  low  resolution.  A  multi-resolution 
detection  algorithm  is  used  to  compensate  different  lighting 
and  low  contrasts.  The  algorithm  provides  hypotheses  of 
sign  regions  for  a  variety  of  scenes  with  large  variations  in 
both  lighting  condition  and  contrast.  At  the  second  level, 
the  system  refines  the  initial  detection  by  employing  various 
adaptive  algorithms.  The  system  focuses  on  each  detected 
area  and  makes  elaborate  analysis  to  guarantee  reliable  and 
complete  detection.  In  most  cases,  the  adaptive  algorithms 
can  lead  to  finding  the  regions  without  missing  any  sign 


region.  At  the  third  level,  the  system  performs  layout 
analysis  based  on  the  outcome  from  the  previous  levels. 
The  design  and  layout  of  signs  are  language  and  culture 
dependent.  For  example,  many  Asia  languages,  such  as 
Chinese  and  Japanese,  have  two  types  of  layout:  the 
horizontal  and  the  vertical.  The  system  provides 
considerable  flexibility  to  allow  the  detection  of  slanted 
signs  and  signs  with  non-uniform  character  sizes. 


4.  SIGN  TRANSLATION 

Sign  translation  has  some  special  problems  compared  to  a 
traditional  language  translation  task.  Sign  translation 
depends  not  only  on  domain  but  also  on  functionality  of  the 
sign.  The  same  text  on  different  signs  can  be  treated 
differently.  In  general,  the  text  used  in  the  sign  is  short  and 
concise.  For  example,  the  average  length  of  each  sign  in  our 
Chinese  sign  database  is  6.02  Chinese  characters.  The 
lexical  mismatch  and  structural  mismatch  problems  become 
more  severe  for  sign  translation  because  shorter 
words/phrases  are  more  likely  to  be  ambiguous  and  there 
isn’t  sufficient  information  from  the  text  to  resolve  the 
ambiguities  which  are  related  to  the  environment  of  the 
sign.  For  example,  in  order  to  make  signs  short, 

abbreviations  are  widely  used  in  signs,  e.g.,  (/ji 

yan  suo/)  is  the  abbreviation  for  ,  (/ji 

sheng  chong  yan  jiu  suo/  institute  of  parasites),  such 
abbreviations  are  difficult,  if  not  impossible,  even  for  a 
human  to  understand  without  knowledge  of  the  context  of 
the  sign.  Since  designers  of  signs  always  assume  that 
readers  can  use  the  information  from  other  sources  to 
understand  the  meaning  of  the  sign,  they  tend  to  use  short 


words,  e.g.  in  sign  I  J  (/man  xing/,  drive  slowly),  the 
word  TT  (/xing/,  walk,  drive)  is  ambiguous,  it  can  mean 

fr^  (/xing  zou/  “move  of  human,”  walk)  or 
“move  of  a  car,”  drive).  The  human  reader  can  understand 
the  meaning  if  he  knows  it  is  a  traffic  sign  for  cars,  but 
without  this  information,  MT  system  cannot  select  the 
correct  translation  for  this  word.  Another  problem  in  sign  is 
structural  mismatch.  Although  this  is  one  of  the  basic 
problems  for  all  MT  systems,  it  is  more  serious  in  sign 
translation:  some  grammatical  functions  are  omitted  to 
make  signs  concise.  Examples  include:  (1)  the  subject  “we” 

is  omitted  in  mm  (/li  mao  dai  ke/,  treat  customers 
politely);  (2)  the  sentence  is  reordered  to  emphasize  the 

topic:  rather  than  saying 

immmxmm  (/qing  jiang  bao  zhuang  zhi 
tou  ru  la  ji  xiang/,  please  throw  wrapping  paper  into  the 

garbage  can),  using  (/bao 

zhuang  zhi  qing  tou  ru  la  ji  xiang/,  wrapping  paper,  please 
throw  it  into  the  garbage  can)  to  highlight  the  “wrapping 
paper.”  With  these  special  features,  sign  translation  is  not  a 


trivial  problem  of  just  using  existing  MT  technologies  to 
translate  the  text  recognized  by  OCR  module. 

Although  a  knowledge-based  MT  system  works  well  with 
grammatical  sentences,  it  requires  a  great  amount  of  human 
effort  to  construct  its  knowledge  base,  and  it  is  difficult  for 
such  a  system  to  handle  ungrammatical  text  that  appears 
frequently  in  signs. 

We  can  use  a  database  search  method  to  deal  with  names, 
phrases,  and  symbols  related  to  tourists.  Names  are  usually 
location  dependent,  but  they  can  be  easily  obtained  from 
many  information  sources  such  as  maps  and  phone  books. 
Phrases  and  symbols  related  to  tourists  are  relative  fixed  for 
a  certain  country.  The  database  of  phrases  and  symbols  is 
relatively  stable  once  it  is  built 

We  propose  to  apply  Generalized  Example  Based  Machine 
Translation  (GEBMT)  [1][2]  enhanced  with  domain 
detection  to  a  sign  translation  task.  This  is  a  data-driven 
approach.  What  EBMT  needs  are  a  set  of  bilingual  corpora 
each  for  one  domain  and  a  bilingual  dictionary  where  the 
latter  can  be  constructed  statistically  from  the  corpora. 
Matched  from  the  corpus,  EBMT  can  give  the  same  style  of 
translations  as  the  corpus.  The  domain  detection  can  be 
achieved  from  other  sources.  Eor  example,  shape/color  of 
the  sign  and  semantics  of  the  text  can  be  used  to  choose  the 
domain  of  the  sign. 

We  will  start  with  the  EBMT  software  [1].  The  system  will 
be  used  as  a  shallow  system  that  can  function  using  nothing 
more  than  sentence-aligned  plain  text  and  a  bilingual 
dictionary;  and  given  sufficient  parallel  text,  the  dictionary 
can  be  extracted  statistically  from  the  corpus.  In  a 
translation  process,  the  system  looks  up  all  matching 
phrases  in  the  source-language  half  of  the  parallel  corpus 
and  performs  a  word-level  alignment  on  the  entries 
containing  matches  to  determine  a  (usually  partial) 
translation.  Portions  of  the  input  for  which  there  are  no 
matches  in  the  corpus  do  not  generate  a  translation. 
Because  the  EBMT  system  does  not  generate  translations 
for  100%  of  its  input  text,  a  bilingual  dictionary  and  phrasal 
glossary  are  used  to  fill  any  gaps.  Selection  of  the  “best” 
translation  is  guided  by  a  trigram  model  of  the  target 
language  and  a  chart  table  [3]. 

5.  A  PROTOTYPE  SYSTEM 

We  have  developed  a  prototype  system  for  Chinese  sign 
recognition  and  translation.  Eigure  3  shows  the  architecture 
of  the  prototype  system.  A  user  can  interactively  involve 
sign  recognition  and  translation  process  when  needed.  Eor 
example,  a  user  can  select  the  area  of  interest,  or  indicate 
that  the  sign  is  a  street  name.  The  system  works  as  follows. 
The  system  captures  the  sign  in  a  natural  background  using 
a  video  camera.  The  system  then  automatically  detects  or 
interactively  selects  the  sign  region.  The  system  performs 


sign  recognition  and  translation  within  the  detected/selected 
region.  It  first  preprocesses  the  selected  region,  binarizes 
the  image  to  get  text  or  symbol,  and  feeds  the  binary  image 
into  the  sign  recognizer.  OCR  software  from  a  third  party  is 
used  for  text  recognition.  The  recognized  text  is  then 
translated  into  English.  The  output  of  the  translation  is  fed 
to  the  user  by  display  on  screen  or  synthesized  speech. 
Eestival,  a  general  purpose  multi-lingual  text-to-speech 
(TTS)  system  is  used  for  speech  synthesis. 


Figure  3  Architecture  of  the  prototype  system 


Figure  4  The  interface  of  the  prototype  system 

An  efficient  user  interface  is  important  to  a  user-centered 
system.  Use  of  interaction  is  not  only  necessary  for  an 
interactive  system,  but  also  useful  for  an  automatic  system. 
A  user  can  select  a  sign  from  multiple  detected  signs  for 
translation,  and  get  involved  when  automatic  sign  detection 
is  wrong.  Figure  4  is  the  interface  of  the  system.  The 
window  of  the  interface  displays  the  image  from  a  video 
camera.  The  translation  result  is  overlaid  on  the  location  of 


the  sign.  A  user  can  select  the  sign  text  using  pen  or  mouse 
anywhere  in  the  window. 

6.  EXPERIMENTAL  RESULTS 

We  have  evaluated  the  prototype  system  for  automatic  sign 
detection  and  translation.  We  have  built  a  Chinese  sign 
database  with  about  800  images  taken  from  China  and 
Singapore.  We  have  tested  the  automatic  detection  module 
using  50  images  randomly  selected  from  the  database. 
Table  1  shows  the  test  result  of  automatic  sign  detection. 
Figure  5  and  Figure  6  show  examples  of  automatic  sign 
detection  with  white  rectangles  indicating  the  sign  regions. 
Figure  5  shows  correct  detection  after  layout  analysis. 
Figure  6  illustrates  a  result  with  a  false  detection  (Note  the 
small  detection  box  below  and  to  the  left  of  the  larger 
detection). 


Table  1  Test  Results  of  Automatic  Detection  on  50 
Chinese  Signs 


Detection 
without  missing 
characters 

Detection 
with  false 
alarm 

Detection  with 
missing  characters 

43 

12 

5 

Figure  5  An  example  of  automatic  sign  detection 


Figure  6  An  example  of  false  detection 

Figure  7  illustrates  two  difficult  examples  of  sign  detection. 
The  text  in  Figure  7(a)  is  easily  confused  with  the  reflective 


background.  The  sign  in  Figure  7(b)  is  embedded  in  the 
background 


(a)  (b) 

Figure  7  Difficult  examples  of  sign  detection 

We  have  also  tested  the  EBMT  based  method.  We  assume 
perfect  sign  recognition  in  our  test.  We  randomly  selected 
50  signs  from  our  database.  We  first  tested  the  system 
includes  a  Chinese-English  dictionary  from  the  Linguistic 
Data  Consortium,  and  a  statistical  dictionary  built  from  the 
HKLC  (Hong  Kong  Legal  Code)  corpus.  As  a  result,  we 
only  got  about  30%  reasonable  translations.  We  then 
trained  with  a  small  corpus  of  670  pairs  of  bilingual 
sentences  [7],  The  accuracy  is  improved  from  30%  to  52% 
on  50  test  signs.  Some  examples  of  errors  are  illustrated 
below: 

Mis-segmentaion: 

Chinese  with  wrong  segmentation: 

/ge  zhong  che  bang  qing  rao  xing/ 

Translation  from  MT: 

All  vehicles  are  please  wind  profession 

Correct  segmentation: 

Translation  if  segmentation  is  correct: 

All  vehicles  please  use  detour 

Lack-domain  information: 

Chinese  with  segmentation: 

/qing  wu  dong  shou/ 

Please  don't  touch  it 

Translation  from  MT: 

Please  do  not  get  to  work 

Domain  knowledge  needed  to  translate 
“start  to  work”  in  domain  such  as  work  plan  and 
“don’t  touch”  in  domains  like  tourism,  exhibition 
etc. 

Proper  Name: 

Chinese  with  segmentation: 

[dM  mu 

/bei  jing  tong  ren  yi  yuan/ 


Beijing  Tongren  Hospital 

Translation  from  MT: 

Beijing  simiiar  humane  hospitai 

at  is  translated  to  the  meaning  of  each 
character  because  it  is  not  identified  as  a  proper 
name  which  then  should  only  be  represented  by  its 
pronunciation. 

Figure  8  illustrates  error  analysis  of  the  translation  module. 
It  is  interesting  to  note  that  40%  of  errors  come  from  mis- 
segmentation  of  words.  There  is  a  big  room  for 
improvement  in  proper  word  segmentation.  In  addition,  we 
can  take  advantage  of  the  contextual  information  provided 
by  the  OCR  module  to  further  improve  the  translation 
quality. 


Error  source 


□  Mis-segmentation 

□  Lack  domain  knowledge 

□  Mis-detection  of  proper 
name 

□  Corpus/dict  not  large  enough 
■  Other 


Figure  8  Error  analysis  of  the  translation  module 

7.  CONCLUSION 

We  have  reported  progress  on  automatic  sign  translation  in 
this  paper.  Sign  translation,  in  conjunction  with  spoken 
language  translation,  can  help  international  tourists  to 
overcome  language  barriers.  A  successful  sign  translation 
system  relies  on  three  key  technologies:  sign  extraction, 
OCR,  and  language  translation.  We  have  developed 
algorithms  for  robust  sign  detection.  We  have  applied 
EBMT  technology  for  sign  translation.  We  have  employed 
a  user-centered  approach  in  developing  an  automatic  sign 
translation  system.  The  approach  takes  advantage  of  human 
intelligence  in  selecting  an  area  of  interest  and  domain  for 
translation  if  needed.  We  have  developed  a  prototype 
system  that  can  recognize  Chinese  signs  input  from  a  video 
camera  which  is  a  common  gadget  for  a  tourist,  and 
translate  them  into  English  text  or  voice  stream. 
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